
12 research outputs found

Prioritized Trajectory Replay: A Replay Memory for Data-driven Reinforcement Learning

Authors: Fan Changjie, Hao Jianye, Hu Yujing, Liu Jinyi, Lv Tangjie, Ma Yi, Zheng Yan
Publication date: 27/06/2023

In recent years, data-driven reinforcement learning (RL), also known as offline RL, has gained significant attention. However, the role of data sampling techniques in offline RL has been overlooked despite their potential to enhance online RL performance. Recent research suggests that applying sampling techniques directly to state-transitions does not consistently improve performance in offline RL. Therefore, in this study, we propose a memory technique, (Prioritized) Trajectory Replay (TR/PTR), which extends the sampling perspective to trajectories for more comprehensive information extraction from limited data. TR enhances learning efficiency by backward sampling of trajectories, which optimizes the use of subsequent state information. Building on TR, we construct a weighted critic target to avoid sampling unseen actions in offline training, and Prioritized Trajectory Replay (PTR), which enables more efficient trajectory sampling prioritized by various trajectory priority metrics. We demonstrate the benefits of integrating TR and PTR with existing offline RL algorithms on D4RL. In summary, our research emphasizes the significance of trajectory-based data sampling techniques in enhancing the efficiency and performance of offline RL algorithms.

arXiv.org e-Print Archive
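
The backward trajectory sampling described in the abstract above could be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `sample_backward`, the toy transitions, and the use of trajectory return as the priority metric are all hypothetical, and the paper's weighted critic target is not reproduced here.

```python
import random

def sample_backward(trajectories, priorities, rng):
    """Pick one trajectory in proportion to its priority, then return its
    transitions ordered from the last step to the first."""
    chosen = rng.choices(trajectories, weights=priorities, k=1)[0]
    return list(reversed(chosen))

# Hypothetical toy data: each transition is (state, action, reward, next_state).
trajectories = [
    [(0, "a", 0.0, 1), (1, "b", 0.0, 2), (2, "a", 1.0, 3)],
    [(5, "b", 0.0, 6), (6, "a", 0.5, 7)],
]

# Assumed priority metric: the undiscounted return of each trajectory.
priorities = [sum(step[2] for step in traj) for traj in trajectories]

# Transitions come out end-first, so a learner sees later states before
# the earlier states that lead to them.
batch = sample_backward(trajectories, priorities, random.Random(0))
```

Sampling whole trajectories rather than isolated transitions is what lets the learner process each step right after its successor; any other non-negative priority could be swapped in for the return used here.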

Rethinking Noisy Label Learning in Real-world Annotation Scenarios from the Noise-type Perspective

Authors: Fan Changjie, Lin Minmin, Liu Haoyu, Lv Tangjie, Wang Haobo, Wu Runze, Zhu Renyu
Publication date: 28/07/2023

We investigate the problem of learning with noisy labels in real-world annotation scenarios, where noise can be categorized into two types: factual noise and ambiguity noise. To better distinguish these noise types and utilize their semantics, we propose a novel sample selection-based approach for noisy label learning, called Proto-semi. Proto-semi initially divides all samples into the confident and unconfident datasets via warm-up. By leveraging the confident dataset, prototype vectors are constructed to capture class characteristics. Subsequently, the distances between the unconfident samples and the prototype vectors are calculated to facilitate noise classification. Based on these distances, the labels are either corrected or retained, resulting in the refinement of the confident and unconfident datasets. Finally, we introduce a semi-supervised learning method to enhance training. Empirical evaluations on a real-world annotated dataset substantiate the robustness of Proto-semi in handling the problem of learning from noisy labels. Meanwhile, the prototype-based repartitioning strategy is shown to be effective in mitigating the adverse impact of label noise. Our code and data are available at https://github.com/fuxiAIlab/ProtoSemi

arXiv.org e-Print Archive
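
The prototype step in the Proto-semi abstract above (build class prototypes from the confident set, then measure how far an unconfident sample lies from each) might look roughly like this. It is a minimal sketch, not the authors' implementation: the helper names, the 2-D toy features, the use of class means as prototypes, Euclidean distance, and the "relabel to the nearest prototype" rule are all assumptions made for illustration.

```python
import math

def mean_vector(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def build_prototypes(features, labels):
    """One prototype per class: the mean feature vector of that class."""
    by_class = {}
    for feature, label in zip(features, labels):
        by_class.setdefault(label, []).append(feature)
    return {label: mean_vector(group) for label, group in by_class.items()}

def nearest_prototype(feature, prototypes):
    """Label of the prototype with the smallest Euclidean distance."""
    return min(prototypes, key=lambda label: math.dist(feature, prototypes[label]))

# Hypothetical confident set (features would normally come from a trained encoder).
confident_x = [[0.0, 0.0], [0.2, 0.0], [1.0, 1.0], [1.0, 0.8]]
confident_y = ["cat", "cat", "dog", "dog"]
prototypes = build_prototypes(confident_x, confident_y)

# An unconfident sample annotated "cat" that sits next to the "dog" prototype.
unconfident_x = [0.9, 0.9]
given_label = "cat"

# Assumed rule: adopt the label of the nearest prototype.
corrected_label = nearest_prototype(unconfident_x, prototypes)
```

How the distances are turned into a correct-or-retain decision for each noise type is specific to the paper; the nearest-prototype rule here only stands in for that step.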

Examining the Effect of Pre-training on Time Series Classification

Authors: Chang Yongzhu, Cheng Ling, Lv Tangjie, Pu Jiashu, Wu Runze, Zhang Rongsheng, Zhao Shiwei
Publication date: 11/09/2023

Although the pre-training followed by fine-tuning paradigm is used extensively in many fields, there is still some controversy surrounding the impact of pre-training on the fine-tuning process. Currently, experimental findings based on text and image data lack consensus. To delve deeper into the unsupervised pre-training followed by fine-tuning paradigm, we have extended previous research to a new modality: time series. In this study, we conducted a thorough examination of 150 classification datasets derived from the Univariate Time Series (UTS) and Multivariate Time Series (MTS) benchmarks. Our analysis reveals several key conclusions. (i) Pre-training can only help improve the optimization process for models that fit the data poorly, rather than those that fit the data well. (ii) Pre-training does not exhibit the effect of regularization when given sufficient training time. (iii) Pre-training can only speed up convergence if the model has sufficient ability to fit the data. (iv) Adding more pre-training data does not improve generalization, but it can strengthen the advantage of pre-training on the original data volume, such as faster convergence. (v) While both the pre-training task and the model structure determine the effectiveness of the paradigm on a given dataset, the model structure plays a more significant role.

arXiv.org e-Print Archive

Reinforcement Learning Experience Reuse with Policy Residual Representation

Authors: Chen Yingfeng, Fan Changjie, Guan Kai, Lv Tangjie, Yu Yang, Zhou Wen-Ji, Zhou Zhi-Hua
Publication date: 31/05/2019

Experience reuse is key to sample-efficient reinforcement learning. One of the critical issues is how the experience is represented and stored. Previously, experience could be stored in the form of features, individual models, or the average model, each lying at a different granularity. However, new tasks may require experience across multiple granularities. In this paper, we propose the policy residual representation (PRR) network, which can extract and store multiple levels of experience. The PRR network is trained on a set of tasks with a multi-level architecture, where a module in each level corresponds to a subset of the tasks. Therefore, the PRR network represents the experience in a spectrum-like way. When training on a new task, PRR can provide different levels of experience to accelerate learning. We experiment with the PRR network on a set of grid world navigation tasks, locomotion tasks, and fighting tasks in a video game. The results show that the PRR network leads to better reuse of experience and thus outperforms some state-of-the-art approaches. Comment: Conference version appears in IJCAI 201

arXiv.org e-Print Archive

Crossref

Variational Iterative Algorithms in Photoacoustic Tomography with Variable Sound Speed

Author: Tangjie Lv and Tie Zhou
Publication venue: Global Science Press

Crossref

DINet: Deformation Inpainting Network for Realistic Face Visually Dubbing on High Resolution Video

Authors: Deng Wenjin, Ding Yu, Fan Changjie, Hu Zhipeng, Lv Tangjie, Zhang Zhimeng
Publication venue: Association for the Advancement of Artificial Intelligence
Publication date: 26/06/2023

For few-shot learning, it is still a critical challenge to realize photo-realistic face visually dubbing on high-resolution videos. Previous works fail to generate high-fidelity dubbing results. To address this problem, this paper proposes a Deformation Inpainting Network (DINet) for high-resolution face visually dubbing. Unlike previous works that rely on multiple up-sample layers to directly generate pixels from latent embeddings, DINet performs spatial deformation on feature maps of reference images to better preserve high-frequency textural details. Specifically, DINet consists of one deformation part and one inpainting part. In the first part, five reference facial images adaptively perform spatial deformation to create deformed feature maps encoding mouth shapes at each frame, in order to align with the input driving audio and the head poses of the input source images. In the second part, to produce face visually dubbing, a feature decoder adaptively combines mouth movements from the deformed feature maps with other attributes (i.e., head pose and upper facial expression) from the source feature maps. Finally, DINet achieves face visually dubbing with rich textural details. We conduct qualitative and quantitative comparisons to validate our DINet on high-resolution videos. The experimental results show that our method outperforms state-of-the-art works.

Association for the Advancement of Artificial Intelligence: AAAI Publications

EasySM: A Data-Driven Intelligent Decision Support System for Server Merge

Authors: Deng Hao, Huang Jie, Lv Tangjie, Qu Manhu, Shen Xudong, Tao Jianrong, Wu Runze
Publication venue: Association for the Advancement of Artificial Intelligence
Publication date: 28/06/2022

As independent social and economic entities, game servers play a dominant role in building a stable, living, and attractive virtual world in massive multi-player online role-playing games (MMORPGs). We propose and implement a novel intelligent decision support system for server merge (SM) to maintain the game ecology at the macro level. The services provided by this system include server health diagnosis, server merge assessment, and combination strategy recommendation. Specifically, we design an effective time series prediction algorithm to diagnose the health status of a server (e.g., user activity, online time, daily revenue) based on real game scenarios, and then select the servers with poor status from all servers. Moreover, to uncover the inherent development laws of servers from the historical merge records, we leverage a correlation measurement algorithm to find historical merged servers that are similar to the servers to be merged and then evaluate the potential trend after merging, which can assist experts in making reasonable decisions. We deploy our system in practice for multiple MMORPGs and achieve sound online performance endorsed by the game operation team.

Association for the Advancement of Artificial Intelligence: AAAI Publications
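
The "find similar historical merged servers" step in the EasySM abstract above could be illustrated with a simple correlation search. The abstract does not say which correlation measure is used, so Pearson correlation is an assumption here, and the server names and activity series are invented toy data.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length series.
    Assumes neither series is constant (otherwise the denominator is zero)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)

# Hypothetical pre-merge daily activity of servers that were merged in the past.
history = {
    "merge_A": [100, 90, 80, 70, 60],
    "merge_B": [50, 55, 60, 65, 70],
}

# Hypothetical activity of the server currently considered for merging.
candidate = [200, 185, 160, 150, 120]

# The most similar historical case is the one with the highest correlation.
best_match = max(history, key=lambda name: pearson(candidate, history[name]))
```

Correlation compares the shape of the trend rather than its scale, so a large declining server can still match a smaller server that declined in the same way before its merge.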

StyleTalk: One-Shot Talking Head Generation with Controllable Speaking Styles

Authors: Deng Zhidong, Ding Yu, Fan Changjie, Hu Zhipeng, Lv Tangjie, Ma Yifeng, Wang Suzhen, Yu Xin
Publication venue: Association for the Advancement of Artificial Intelligence
Publication date: 26/06/2023

Different people speak with diverse personalized speaking styles. Although existing one-shot talking head methods have made significant progress in lip sync, natural facial expressions, and stable head motions, they still cannot generate diverse speaking styles in the final talking head videos. To tackle this problem, we propose a one-shot style-controllable talking face generation framework. In a nutshell, we aim to attain a speaking style from an arbitrary reference speaking video and then drive the one-shot portrait to speak with the reference speaking style and another piece of audio. Specifically, we first develop a style encoder to extract dynamic facial motion patterns of a style reference video and then encode them into a style code. Afterward, we introduce a style-controllable decoder to synthesize stylized facial animations from the speech content and style code. In order to integrate the reference speaking style into generated videos, we design a style-aware adaptive transformer, which enables the encoded style code to adjust the weights of the feed-forward layers accordingly. Thanks to the style-aware adaptation mechanism, the reference speaking style can be better embedded into synthesized videos during decoding. Extensive experiments demonstrate that our method is capable of generating talking head videos with diverse speaking styles from only one portrait image and an audio clip while achieving authentic visual effects. Project Page: https://github.com/FuxiVirtualHuman/styletalk

Association for the Advancement of Artificial Intelligence: AAAI Publications